SPREPI: Selective Prediction and REplay for Predicated Instructions

Authors

Nathanaël Prémillieu

André Seznec

Abstract

ARM ISA-based processors are no longer low-cost, low-power processors. Nowadays, ARM ISA-based processor manufacturers are struggling to implement medium-end to high-end processor cores, and this implies implementing a state-of-the-art out-of-order execution engine. Unfortunately, providing efficient out-of-order execution on legacy ARM codes may be quite challenging due to predicated instructions. In this paper, we propose a new hardware solution, Selective Prediction and REplay for Predicated Instructions (SPREPI), to provide efficient out-of-order execution of codes featuring predicated instructions. Predicting the predicated instructions addresses the so-called multiple definition problem. Predicated instructions are predicted using either a global branch-and-predicate history predictor or a global history predictor. But systematic usage of predicate prediction sometimes impairs performance dramatically. Efficient filters are proposed to disable predicate prediction when it is likely to be counter-productive. Moreover, the predicate misprediction penalty can be as high as the branch misprediction penalty. To reduce this penalty, we introduce a specific selective replay hardware component targeting mispredicted predicated instructions. SPREPI is shown to allow high out-of-order execution performance on ARM codes, even when they are generated with a compiler applying if-conversion only to very short branches. Moreover, since SPREPI predicts most of the predicated instructions, a relatively inefficient hardware solution is sufficient for executing the few predicated instructions on which prediction is not used.
Key-words: prediction, predicated instructions, predication, selective, replay, out-of-order execution, ARM

∗ IRISA/Université de Rennes 1
† IRISA/INRIA

hal-00856160, version 1 - 30 Aug 2013

SPREPI : prédiction et rejeu sélectif pour les instructions prédiquées

Résumé (translated from French): Processors based on the ARM instruction set are no longer only low-budget, low-power processors. Nowadays, ARM processor designers seek to build medium- to high-performance execution cores, which implies implementing a state-of-the-art out-of-order execution engine. However, providing efficient out-of-order execution can prove difficult on legacy ARM codes because of predicated instructions. In this paper, we propose a new hardware solution, selective prediction and replay for predicated instructions (SPREPI), to enable efficient execution of codes with predicated instructions. Predicting predicated instructions solves the so-called multiple definition problem. They are predicted using either a global branch-and-predicate history predictor or a branch-history-only predictor. However, systematic use of prediction sometimes causes dramatic performance losses. Efficient filters are proposed to disable the use of prediction when it is likely to be counter-productive. Moreover, a predicate misprediction can be as costly as a branch misprediction. To reduce this cost, we propose a specific selective replay hardware mechanism targeting mispredicted predicated instructions. We show that SPREPI enables high-performance out-of-order execution on ARM code, even when the compiler applies the if-conversion transformation only to very short branches. Moreover, since SPREPI predicts most predicated instructions, a relatively inefficient hardware solution suffices to execute the few predicated instructions for which prediction is not used.
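The multiple definition problem mentioned in the abstract can be illustrated with a toy register-renaming model. This is a minimal sketch under stated assumptions, not the SPREPI hardware: the names `RenameTable` and `rename` are hypothetical, and the model only counts the values an instruction must wait for. Without prediction, a predicated instruction must read the old value of its destination and the flags as well as its regular sources, because later consumers need one physical register that holds either the new or the old value. With a prediction, it is renamed either as an unconditional instruction or as a no-op. The later check of the predicate is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class RenameTable:
    """Toy architectural-to-physical register map (hypothetical model)."""
    mapping: dict = field(default_factory=dict)
    next_phys: int = 0

    def fresh(self):
        # Allocate a new physical register number.
        p = self.next_phys
        self.next_phys += 1
        return p

    def lookup(self, arch):
        # Return the current physical register for an architectural one.
        if arch not in self.mapping:
            self.mapping[arch] = self.fresh()
        return self.mapping[arch]


def rename(table, dest, srcs, predicated=False, prediction=None):
    """Rename one instruction; return (new_dest_phys, dependence_list).

    prediction is None (prediction not used), True, or False.
    """
    sources = [table.lookup(s) for s in srcs]
    if not predicated or prediction is True:
        # Treated as unconditional: depends only on its regular sources.
        new = table.fresh()
        table.mapping[dest] = new
        return new, sources
    if prediction is False:
        # Treated as a no-op: the destination mapping is left untouched.
        return None, []
    # No prediction: the result is "new value or old value", so the
    # instruction also depends on the old destination and on the flags.
    old = table.lookup(dest)
    flags = table.lookup("flags")
    new = table.fresh()
    table.mapping[dest] = new
    return new, sources + [old, flags]
```

In this model, an unpredicted predicated instruction with two sources waits on four values, a predicted-true one on two, and a predicted-false one on none, which is the scheduling benefit that prediction is meant to recover.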

Similar resources

Practical Selective Replay for Reduced-Tag Schedulers

The trend towards deeper microprocessor pipelines has made it advantageous or necessary to predict the events that may happen in the stages ahead. A widely-used example of this technique is latency speculation, where the non-deterministic latency of some instructions, such as loads, forces dependents to predict the number of clock cycles these operations will take to complete execution. If ther...

Predicting L2 Misses to Increase Issue-Queue Efficacy

The issue queue keeps the instructions that are waiting for the availability of input operands and issue slots. While some instructions remain for a few cycles in the issue queue, the instructions dependent on L2 misses may remain there for hundreds of cycles due to the L2 miss latency. Some authors have proposed mechanisms to extract these instructions from the issue queue. However, these mech...

Predicated Instructions for Code Compaction

Procedural abstraction, the replacement of several identical code sequences with calls to a single representative function, is a powerful tool in producing compact executables. We explore how predicated instructions can be used to allow procedural abstraction of non-identical basic blocks. A predicated instruction is one that the processor executes if a condition (specified in the opcode) is tr...

Selective Guarded Execution Using Profiling on a Dynamically Scheduled Processor

Modern dynamically scheduled processors use branch prediction hardware to speculatively fetch and execute the most likely executed paths in a program. Complex branch predictors have been proposed which attempt to identify these paths accurately such that the hardware can benefit from out-of-order (OOO) execution. Recent studies have shown that in spite of such complex prediction schemes, there still ...

Guarded Execution and Branch Prediction in Dynamic ILP Processors†

In this paper we evaluate the effects of guarded (or conditional, or predicated) execution on the performance of an instruction level parallel processor employing dynamic branch prediction. First, we assess the utility of guarded execution, both qualitatively and quantitatively, using a variety of application programs. Our assessment shows that guarded execution significantly increases the oppo...

Publication date: 2013
